# Graph neural network for colliding particles with an application to sea ice floe modeling

This project explores graph neural networks for modeling sea ice floes.

The code has been tested on Windows and Linux machines with Anaconda and CUDA GPUs. The instructions below describe how to run the code on a CUDA GPU to replicate the results. In general, macOS machines, with or without CUDA GPUs, should also work after modifications to the setup.

## Setup

Follow these steps to set up the project:

### Download Repository
1. Download this repository.
2. Navigate to the main project folder. It should contain four subfolders: `data`, `model training code`, `model analysis code`, and `visualization`.
3. The `data` folder contains the training and testing data for the 10-node and 30-node cases.
4. The `model training code` folder contains the code for training the model.
5. The `model analysis code` folder contains the code for evaluating the model and generating results.
6. The `visualization` folder contains visualizations for the different cases, organized into subfolders:
   - `dataset`: images and video of the raw training data.
   - `prediction`: images and video of the model's predictions.
   - `generalization`: images and video of the model's predictions beyond the original time domain.
   - `DA`: graphs of the predictions obtained by combining the model with the ensemble Kalman filter (EnKF) or the ensemble transform Kalman filter (ETKF).
   - `results`: graphs of the model's PCC and RMSE.

### Download Data
1. Download data `data.zip` from [DataDryad](https://datadryad.org/stash/share/lUrgNzqfqc-dBmWDYRUfgcTe4h8MsFMOrjYrDYnVIVc).
2. Unzip the downloaded file and move the `data` folder into the project's main folder.
3. Confirm that the project's main folder now contains the `data` folder alongside `model training code`, `model analysis code`, and `visualization`.
4. Inside the `data` folder, you should find various subfolders, such as `agents`, `agents_all`, `agents_temp`, `analysis_data`, `monkey_data`, `training_curve`, and `training_curve_temp`.

### Set up the Python environment
1. Download and install the [Anaconda distribution](https://www.anaconda.com/download), or use a Python virtual environment.
2. Install the [NumPy](https://numpy.org/install/) package.
3. Install the [PyTorch](https://pytorch.org/get-started/locally/) package.
4. Install the [PyTorch Geometric](https://pytorch-geometric.readthedocs.io/en/latest/install/installation.html) package.
5. Install the [Matplotlib](https://matplotlib.org/stable/install/index.html) package.
6. Install the [SciPy](https://scipy.org/install/) package.
7. Install the [OpenCV](https://pypi.org/project/opencv-python/) package.
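
Once the packages are installed, a quick check can confirm that the environment is complete. The snippet below is a minimal sketch, not part of the repository: `REQUIRED` and `missing_packages` are hypothetical names, and the list simply mirrors the packages named above using their usual import names (OpenCV is imported as `cv2`, PyTorch Geometric as `torch_geometric`).

```python
import importlib.util

# Import names of the packages listed above (hypothetical helper list).
REQUIRED = ["numpy", "torch", "torch_geometric", "matplotlib", "scipy", "cv2"]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        import torch  # located by find_spec above

        print("All packages found. CUDA available:", torch.cuda.is_available())
```

If `torch.cuda.is_available()` reports `False`, the code would need to run on the CPU, which is one of the setup modifications mentioned earlier.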


## Training
After completing the setup, follow the steps below to run the code.
1.  Navigate to the `model training code` folder. The `_10` and `_30` suffixes denote the 10-node and 30-node cases, respectively.
2.  The data input path and the training checkpoint output path can be changed in the code.
3.  Run `CN_plot_10.py` or `CN_plot_30.py` to train the model.


### Training parameters

The hyperparameters are defined in the `params` dictionary in `CN_plot_10.py` or `CN_plot_30.py`.

### Checkpoints saving

During training, model checkpoints are saved automatically to the configured folder at a set interval. These settings can be changed through the `params` dictionary and the `model_path` variable in `CN_plot_10.py` or `CN_plot_30.py`.
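
To illustrate interval-based checkpointing, here is a minimal sketch. It is not the repository's code: the keys `epochs` and `save_interval`, the filename pattern, and the helper `checkpoint_file` are assumptions made for illustration. Only the names `params` and `model_path` come from the training scripts.

```python
from pathlib import Path

# Hypothetical values; the real keys are defined in `params` in the training script.
params = {"epochs": 10, "save_interval": 5}
model_path = Path("checkpoints")  # stands in for the script's `model_path`


def checkpoint_file(epoch: int) -> Path:
    """Build a checkpoint filename that encodes the epoch number."""
    return model_path / f"model_epoch_{epoch:05d}.pt"


saved = []
for epoch in range(1, params["epochs"] + 1):
    # ... one epoch of training would run here ...
    if epoch % params["save_interval"] == 0:
        # In a real script: torch.save(model.state_dict(), checkpoint_file(epoch))
        saved.append(checkpoint_file(epoch))

print([p.name for p in saved])
```

Encoding the epoch in the filename keeps every checkpoint on disk, so they can all be compared later.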


## Analysis 

### Checkpoints evaluation

After training, run `compare_CN_10.py` or `compare_CN_30.py`. These scripts identify the optimal checkpoint,
defined as the one with the minimum loss on the test set.
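
The selection rule can be sketched as follows. The checkpoint names and loss values are invented for illustration, and `best_checkpoint` is a hypothetical helper, not a function from the comparison scripts.

```python
# Hypothetical test-set losses keyed by checkpoint filename.
test_losses = {
    "model_epoch_00005.pt": 0.042,
    "model_epoch_00010.pt": 0.031,
    "model_epoch_00015.pt": 0.037,
}


def best_checkpoint(losses: dict) -> str:
    """Return the checkpoint name with the smallest test loss."""
    return min(losses, key=losses.get)


# The winning name is what the analysis scripts expect in `MODEL_NAME`.
MODEL_NAME = best_checkpoint(test_losses)
print(MODEL_NAME)
```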

### Model Results and visualization
For all scripts that generate results, set the `MODEL_NAME` variable to the optimal checkpoint obtained from `compare_CN_10.py` or `compare_CN_30.py`.

Results and visualizations, including graphs and videos, can be obtained by running `CN_plot_10.py` or `CN_plot_30.py`.
To test generalization beyond the original time domain, set the `total_time_steps` variable to a value greater than 10,000. In the experiments,
the results remained plausible up to 20,000 time steps.
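
The results graphs report PCC and RMSE. For readers unfamiliar with these metrics, the sketch below shows how they are commonly computed for a predicted and a reference trajectory. It assumes PCC denotes a Pearson-type correlation coefficient, and the function names are hypothetical. The repository's own scripts may compute these quantities differently, for example with NumPy arrays.

```python
import math


def rmse(pred, truth):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred))


def pcc(pred, truth):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(truth) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, truth))
    norm_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    norm_t = math.sqrt(sum((t - mean_t) ** 2 for t in truth))
    return cov / (norm_p * norm_t)


# Toy trajectories, for illustration only.
truth = [0.0, 1.0, 2.0, 3.0]
pred = [0.1, 0.9, 2.2, 2.8]
print(rmse(pred, truth), pcc(pred, truth))
```

A low RMSE together with a PCC close to 1 indicates that the prediction tracks the reference in both magnitude and shape.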

To run the model with EnKF, run `ENKF_CN_plot_10.py` or `ENKF_CN_plot_30.py`. To run the model with ETKF, run `ETKF_CN_plot_10.py` or `ETKF_CN_plot_30.py`.
For both DA methods, the observation frequency is set by the `observation_freq` variable and the number of ensemble members by
the `ensemble_number` variable.
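
To show how these two variables typically enter a data assimilation loop, here is a toy sketch of a stochastic (perturbed-observation) EnKF analysis step for a single, directly observed scalar state. It is not the repository's implementation. The function `enkf_update`, the placeholder forecast, and the reading of `observation_freq` as "assimilate every N model steps" are all assumptions.

```python
import random


def enkf_update(ensemble, observation, obs_var, rng):
    """Perturbed-observation EnKF analysis for a directly observed scalar state."""
    n = len(ensemble)
    mean = sum(ensemble) / n
    forecast_var = sum((x - mean) ** 2 for x in ensemble) / (n - 1)
    gain = forecast_var / (forecast_var + obs_var)  # scalar Kalman gain (H = 1)
    obs_std = obs_var ** 0.5
    # Each member assimilates its own noisy copy of the observation.
    return [x + gain * (observation + rng.gauss(0.0, obs_std) - x) for x in ensemble]


observation_freq = 50  # assumed meaning: assimilate every 50 model steps
ensemble_number = 20   # number of ensemble members

rng = random.Random(0)
ensemble = [rng.gauss(0.0, 1.0) for _ in range(ensemble_number)]
for step in range(1, 201):
    # Placeholder forecast; a trained model would advance each member here.
    ensemble = [x + 0.01 for x in ensemble]
    if step % observation_freq == 0:
        ensemble = enkf_update(ensemble, observation=2.0, obs_var=0.1, rng=rng)
```

Under this reading, a smaller `observation_freq` assimilates data more often, while a larger `ensemble_number` gives a better estimate of the forecast variance at a higher computational cost.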

## Reproducibility

The PyTorch documentation states:
> Completely reproducible results are not guaranteed across PyTorch releases, individual commits, or different platforms. Furthermore, results may not be reproducible between CPU and GPU executions, even when using identical seeds (https://pytorch.org/docs/stable/notes/randomness.html#reproducibility).

Training outcomes can vary across machines when updates run on the GPU, so the learned network weights may differ even with identical seeds (as detailed in these discussions on the [PyTorch](https://discuss.pytorch.org/t/reproducibility-over-different-machines/63047) forum and [GitHub](https://github.com/pytorch/pytorch/issues/38219)). In general, the conclusions drawn from the results still hold.
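
Fixing seeds still helps reduce run-to-run variation on a single machine. The helper below is a minimal sketch with a hypothetical name, `seed_everything`, and it is not taken from the repository. It seeds Python, NumPy, and PyTorch where they are installed and requests deterministic cuDNN behavior. As the quotation above makes clear, this does not guarantee identical results across platforms.

```python
import random


def seed_everything(seed: int) -> None:
    """Seed the available random number generators (hypothetical helper)."""
    random.seed(seed)
    try:
        import numpy as np

        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch

        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # silently ignored without CUDA
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass  # PyTorch not installed; nothing to seed


seed_everything(0)
```

Calling the helper once at the top of a training script, before the model and data loaders are created, is the usual placement.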
